Frederick J.

ABSTRACT

A method of optimizing a computer program for reduced power consumption
by a processor (10) having functional units (11d, 11e) that are
independently controllable by instructions. The processor's instruction
set (FIG. 4) has instructions that may be directed to a particular
functional unit (11d, 11e) so as to place that functional unit in a
power-down state while not being used during a program segment.



9 Claims, 8 Drawing figures 
POWER REDUCTION FOR PROCESSORS BY SOFTWARE CONTROL OF FUNCTIONAL UNITS

This application claims priority under 35 USC §119(e)(1) of Provisional
Application No. 60/068,646, filed Dec. 23, 1997.

TECHNICAL FIELD OF THE INVENTION

This invention relates to microprocessors, and more particularly to
methods of using programming instructions in a manner that reduces the
power consumption of a microprocessor.

BACKGROUND OF THE INVENTION

Power efficiency for processor-based equipment is becoming increasingly
important as people are becoming more attuned to energy conservation
issues. Specific considerations are the reduction of thermal effects and
operating costs. Also, apart from energy conservation, power efficiency
is a concern for battery-operated processor-based equipment, where it is
desired to minimize battery size so that the equipment can be made small
and lightweight. The "processor-based equipment" can be either equipment
designed especially for general computing or equipment having an
embedded processor.

From the standpoint of processor design, a number of techniques have
been used to reduce power usage. These techniques can be grouped as two
basic strategies. First, the processor's circuitry can be designed to
use less power. Second, the processor can be designed in a manner that
permits power usage to be managed.

In the past, power management has been primarily at the system level.
Various "power down" modes have been implemented, which permit parts of
the system, such as a disk drive, display, or the processor itself to be
intermittently powered down.

The entry of a device into a power down mode can be initiated in various
ways, such as in response to a timer or in response to an instruction
from the processor. In the case of the former, the timer automatically
shifts the device into a power down mode after it has been inactive for
a preset period. In the case of the latter, i.e., instruction-implemented
power management, various standards have been developed to place power
management under processor control. One such standard is the Advanced
Power Management interface specification, developed jointly by Intel and
Microsoft.

One approach to processor power management is described in U.S. Pat.
No. 5,584,031, entitled "System and Method for Executing a Low Power
Delay Instruction". A special instruction (a "sleep" opcode) specifies a
number of timing cycles during which activity of the central processing
unit is delayed.

Another approach to processor power management is described in U.S. Pat.
No. 5,495,617, entitled "On Demand Powering of Necessary Portions of
Execution Unit by Decoding Instruction Word Field Indications Which Unit
is Required for Execution". An instruction decoder differentiates
"control" instructions from "execute" instructions. If the instruction
is a "control" instruction, it does not involve the execution unit, and
a standby signal can be sent to the execution unit.

SUMMARY OF THE INVENTION

One aspect of the invention is a method of optimizing a computer program
for reduced power consumption. The method is used with programs written
for a processor having distinct "functional units" to which instructions
may be independently directed. The processor's instruction set is
modified so as to provide special "power-down" instructions that may be
directed to one or more functional units independently of other
functional units. Then, for each functional unit, the computer program
is scanned to locate segments of the program where that functional unit
is not used. Based on the results of the scanning step, power-down
instructions are inserted into the program, such that the functional
unit uses less power while not in use. The method can be performed
manually by an assembly language programmer or by a code optimization
program.

An advantage of the invention is that it provides power management at an
"on-chip" level, as compared to the computer system level. The level of
power management is "fine-grained", being directed to components within
the processor. Power management can be directed even to functional units
within the processor's CPU.

A further advantage is that when a power-down instruction is used in
accordance with the invention, the rest of the processor is fully
operational. The program continues to execute as if the instructions
were not there, because only functional units that are not used are
affected. Thus, when inserted into a particular application program, the
power-down instructions operate transparently to the programming in
terms of both function and execution time.

Selective power management of functional units within a processor
facilitates the use of specialized on-chip circuitry. Examples are
circuits for performing special functions such as floating point
operations, Fourier transforms, and digital signal filtering. Such
circuits can be included on-chip and only consume power when used. Thus,
in today's manufacturing environment, where such circuits are less and
less expensive to include on-chip, power consumption need not be a
drawback to including them.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of an example processor, which has functional
units to which instructions may be independently directed in accordance
with the invention.

FIG. 2 illustrates the format of a fetch packet used by the processor of
FIG. 1.

FIG. 3 is an example of a fetch packet.

FIG. 4A illustrates one embodiment of an instruction set of the
processor of FIG. 1.

FIG. 4B is a table describing the mnemonics of FIG. 4A.

FIG. 5 illustrates an example of a program code segment having a
power-down instruction in accordance with the invention.

FIG. 6 illustrates another embodiment of an instruction set of the
processor of FIG. 1.

FIG. 7 illustrates a method of optimizing a computer program to reduce
power consumption in accordance with the invention.

DETAILED DESCRIPTION OF THE INVENTION

The invention described herein is directed to power management for
microprocessors. The processor's instruction set is designed in a manner
that permits the programmer (or compiler) to generate code that will be
efficient in terms of power usage. Special power-down instructions
control the power consumption of circuits internal to the processor.
The method of the invention can be applied to any type of processor
provided that its instruction set has, or is amenable to, the type of
instructions described herein. The common characteristic of any
processor with which the invention is useful is that it has more than
one functional unit, whose activity can be independently controlled by
instruction. In other words, an instruction may be selectively directed
to a functional unit.

These "functional units" are described herein so as to comprise
components within the processor's central processing unit, such as
separate datapaths or circuits within separate datapaths. Additionally,
as described below, the functional units may comprise components within
the processor but peripheral to its central processing unit, such as
memory devices or specialized processing units.

Thus, the method of the invention is useful with VLIW (very long
instruction word) processors, which are characterized by their use of
different functional units to execute instructions in parallel. The
invention is also useful with "dual datapath" processors, which use two
datapaths to execute instructions in parallel. Both types of processors
are characterized by having more than one functional unit that operate
in parallel (or substantially in parallel). However, the invention is
also useful with processors having functional units that do not operate
in parallel but that are "independently instructable" as described in
the preceding paragraph. In fact, the invention is most useful in the
latter case, where the serial operation of functional units makes it
more likely that a particular functional unit will not be used during
execution of a program.

In light of the preceding paragraph, the term "processor" as used herein
may include various types of microcontrollers and digital signal
processors (DSPs), as well as general purpose computer processors. The
following description is in terms of DSPs (the TMS320 family of DSPs,
and a modification of the TMS320C6x DSP in particular). However, this
selection of a particular processor is for purposes of description and
example only.

Processor Overview

FIG. 1 is a block diagram of a DSP processor 10. As explained below,
processor 10 has a VLIW architecture, and fetches multiple-instruction
words (as "fetch packets") to be executed in parallel (as "execute
packets") during a single CPU clock cycle. In the example of this
description, processor 10 operates at a 5 nanosecond CPU cycle time and
executes up to eight instructions every cycle.

Processor 10 has a CPU core 11, which has a program fetch unit 11a, and
instruction dispatch and decode units 11b and 11c, respectively. To
execute the decoded instructions, processor 10 has two datapaths 11d and
11e.

Instruction decode unit 11c delivers execute packets having up to eight
instructions to the datapath units 11d and 11e every clock cycle.
Datapaths 11d and 11e each include 16 general-purpose registers.
Datapaths 11d and 11e each also include four functional units (L, S, M,
and D), which are connected to the respective 16 general-purpose
registers. Thus, processor 10 has eight functional units, each of which
may execute one of the instructions in an execute packet. Each
functional unit has a set of instruction types that it is capable of
executing.

The control registers 11f provide the means to configure and control
various processor operations. The control logic unit 11g has logic for
control, test, emulation, and interrupt functions.

Processor 10 also comprises program memory 12, data memory 13, and
timers 14. Its peripheral circuitry includes a direct memory access
(DMA) controller 15, external memory interface 16, host port 17, and
power down logic 18. The power down logic 18 can halt CPU activity to
reduce power consumption. There is more than one power-down mode, which
vary by what clocks are still active in the CPU. These power-down modes,
as well as features of processor 10 other than the features of the
present invention, are described in Provisional Application No.
60/046,811, to [inventor name illegible], entitled "Module-Configurable,
Full-Chip Power Profiler", filed May 2, 1997, now abandoned, serving as
priority to U.S. patent application Ser. No. 09/066,620, filed Apr. 24,
1998, assigned to Texas Instruments Incorporated and incorporated herein
by reference.

Processor 10 executes RISC-like code, and has an assembly language
instruction set. In other words, each of its VLIWs comprises RISC-type
instructions. A program written with these instructions is converted to
machine code by an assembler, or a higher-level program is converted to
machine code by a compiler. Processor 10 does not use microcode or an
internal microcode interpreter, as do some other processors. However,
the invention described herein could be applicable regardless of whether
RISC-like instructions control the processor or whether instructions are
internally interpreted to a lower level.

In the example of this description, eight 32-bit instructions are
combined to make the VLIW fetch packet. Thus, in operation, 32-bit
instructions are fetched eight at a time from program memory 12, to make
a 256-bit instruction word.

FIG. 2 illustrates the basic format of the fetch packet 20 used by
processor 10. Each of the eight instructions in fetch packet 20 is
placed in a location referred to as a "slot" 21. Thus, fetch packet 20
has Slots 1, 2, . . . 8.

Processor 10 differs from other VLIW processors in that the entire fetch
packet is not necessarily executed in one CPU cycle. All or part of a
fetch packet is executed as an "execute packet". In other words, a fetch
packet can be fully parallel, fully serial, or partially serial. In the
case of a fully or partially serial fetch packet, where the fetch
packet's instructions require more than one cycle to execute, the next
fetch can be postponed. This distinction between fetch packets and
execute packets permits every fetch packet to contain eight
instructions, without regard to whether they are all to be executed in
parallel.

For processor 10, the execution grouping of a fetch packet 20 is
specified by a "p-bit" 22 in each instruction. In operation, instruction
dispatch unit 11b scans the p-bits, and the state of the p-bit of each
instruction determines whether the next instruction will be executed in
parallel with that instruction. If so, it places the two instructions in
the same execute packet to be executed in the same cycle.

FIG. 3 illustrates an example of a fetch packet 20. Whereas FIG. 2
illustrates the format for the fetch packet 20, FIG. 3 illustrates an
example of instructions that a fetch packet 20 might contain. Each of
the eight instructions has a number of fields, which ultimately are
expressed in bit-level machine code. The NOP (no operation) instruction
is a placeholder and has no execution associated with it. The ||
characters signify that an instruction is to execute in parallel with
the previous instruction, and is coded as p-bit 22. As indicated, fetch
packet 20 is fully parallel, and may be executed as a single execute
packet.

The square brackets [ ] signify a conditional instruction, surrounding
the identifier of a condition register. Thus, the first instruction in
FIG. 3 is conditioned on register A2 being nonzero. A ! character
signifies "not", so that a condition on A2 being zero would be expressed
as [!A2]. The conditional register field comprises these identifiers.
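The p-bit grouping described above, in which a fetch packet is divided into one or more execute packets, can be illustrated with a short sketch. This is only a model under stated assumptions: each instruction is represented as a hypothetical (text, p_bit) pair, a p-bit of 1 is taken to mean that the next instruction executes in parallel with the current one, and the function name is illustrative rather than part of processor 10.

```python
from typing import List, Tuple

# Hypothetical model: (instruction text, p_bit). A p_bit of 1 is assumed to
# mean "the NEXT instruction executes in parallel with this one".
Instruction = Tuple[str, int]


def split_into_execute_packets(fetch_packet: List[Instruction]) -> List[List[str]]:
    """Group the instructions of one fetch packet into execute packets."""
    packets: List[List[str]] = []
    current: List[str] = []
    for text, p_bit in fetch_packet:
        current.append(text)
        if p_bit == 0:
            # The chain of parallel instructions ends here.
            packets.append(current)
            current = []
    if current:
        # A trailing p_bit of 1 has nothing left to chain to in this packet.
        packets.append(current)
    return packets


# A partially serial fetch packet of eight instructions.
partial = [("A", 1), ("B", 0), ("C", 0), ("D", 1),
           ("E", 1), ("F", 0), ("G", 0), ("H", 0)]
execute_packets = split_into_execute_packets(partial)
```

Under this model a fully parallel fetch packet yields a single execute packet of eight instructions, and a fully serial one yields eight execute packets of one instruction each.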
The opfield contains an instruction type from the instruction set of
processor 10. Following the instruction type is the designation of the
functional unit that will execute the instruction. As stated above in
connection with FIG. 1, each of the two datapaths 11d and 11e has four
functional units. These functional units are L (logical), S (shift), M
(multiply), and D (data). The opfield thus has the syntax
[instruction type].[functional unit identifier].

Some instruction types can be performed by only one functional unit and
some can be performed by one of a number of them. For example, only the
M unit can perform a multiply (MPY). On the other hand, an add (ADD) can
be performed by the L, S, or D unit. The correspondence of functional
units to instructions is referred to herein as their "mapping".

FIG. 4A is a table illustrating, for processor 10, the mapping of
instruction types to functional units. FIG. 4B is a table describing the
mnemonics of FIG. 4A.

The mapping of functional units to instruction types determines which
instructions can be executed in parallel, and therefore whether a fetch
packet will become more than one execute packet. For example, if only
the M unit can perform a multiply (MPY), an execute packet could have
two MPY instructions, one to be executed by each of the two datapaths
11d and 11e. In contrast, the L, S, and D units are all capable of
executing an add (ADD), thus an execute packet could contain as many as
six ADD instructions.

Referring again to FIG. 3, the instruction's operand field follows the
opfield. Depending on the instruction type, the operand field may
identify one or more source registers, one or more constants, and a
destination register.

To generalize the above-described processor architecture, an executable
instruction word, i.e., an execute packet, contains up to eight
instructions to be executed in parallel during a CPU cycle. Each
instruction in an execute packet uses a different one of the functional
units (L, D, S or M) of datapaths 11d and 11e. The instruction mapping
determines the functional unit(s) to which an instruction may be
directed. The ability to independently instruct functional units in this
manner lends itself to unique techniques for power optimization. As
explained below, power-down instructions can be used, so that the power
consumption of functional units is independently controlled.

Selective Power Control of Functional Units

Referring again to FIG. 4, the instruction set of processor 10 is
comprised of a number of instruction types, each of which may be
executed by one or more functional units. These functional units are
illustrated in FIG. 1, as the L, D, M, and S units of datapaths 11d and
11e.

The SLEEP instruction type is for "power-down" instructions. It may be
used with any of the functional units, M, S, D, and L. The instruction
types other than SLEEP are referred to herein as "active" instructions.

During execution of a program, while a functional unit is not executing
active instructions, it may be given the SLEEP instruction. Using the
instruction format described above in connection with FIG. 2, the
instruction has the following opfield:

SLEEP.[identifier of functional unit]

Only the specified functional unit is affected by the instruction; each
other remains in its ready state unless otherwise instructed by its own
power-modifying instruction. In the above example, each opfield controls
a single functional unit. However, instruction types could also be
provided that control more than one functional unit, such as
SLEEPx.x.x . . . , where the x's specify which functional unit(s) are to
be affected. In the example of this description, the x's could be M, L,
S, or D. Thus, a power-down instruction for functional units M and L
would be SLEEPML.

As described above, reducing the readiness of a functional unit is done
explicitly. A power-down instruction is placed in the program code at a
point where the functional unit enters a period of non-use.

Restoring the functional unit to a state of readiness could be explicit
or implicit. In the case of explicit restoration, a "power-up"
instruction could be a new instruction type, such as "WAKE", or a toggle
of the power-down instruction. For a toggle, the same instruction
encountered a first time would reduce readiness and encountered a second
time would restore readiness. In the case of implicit restoration, an
"executable" instruction (an instruction other than SLEEP) to a
functional unit that has previously received a power-down instruction
could automatically result in restoration of its readiness.

A set of functional units whose readiness is reduced by one instruction
could be separately restored at different times. For example, two
functional units, M and D, might be given the SLEEP instruction at the
same point in a program, but their readiness can be restored at
different times by separate instructions.

FIG. 5 illustrates a segment of program code where a functional unit, L,
is not used for active instructions during a number of execute cycles.
These execute cycles are represented by execute packets EP1 ... EPN. A
power-down instruction for functional unit L, SLEEP.L, appears in EP1.
The restoration of L to a ready state is implicit, by the next active
instruction directed to L, an ADD.

Selective Power Control of Other Functional Units

As described above, the architecture of processor 10 lends itself to
power-down instructions directed to functional units within datapaths
11d and 11e. In alternative embodiments, the power-down instructions
could be directed to the datapaths themselves as functional units,
rather than to their individual functional units.

These concepts can be applied to power control of other functional units
within processor 10. For example, within CPU 11, subsets of the control
registers 11f could be selected for power modification when not used.
Other components suitable for power modification are specialized
execution units such as floating point units and FFT units (not shown).
Also, portions of memory 12 or memory 13 could be similarly powered
down. In general, the invention applies to the selective
power-modification of any "functional unit" within processor 10, where
the functional unit may either directly execute instructions or serve
some peripheral function, and regardless of whether it is internal or
external to the central processing unit 11.

FIG. 6 illustrates an example of instruction-mapping for power-down
instructions directed to datapaths 11d and 11e within CPU 11, as well as
to peripheral components within processor 10 but outside CPU 11. For
example, the peripheral unit might be the external memory interface 16.
A SLEEP instruction type may be directed to any one of these functional
units.

As further illustrated in FIG. 6, one or more additional instruction
types may be added to place functional units into intermediate
power-down levels. In a "less ready" state, a functional unit would
consume less or no power. In a "more ready" state, it would consume more
power but could more quickly be made ready for use. For example, in
addition to the SLEEP instruction type, the instruction set of processor
10 has a REST instruction type. An example
implementation of an intermediate power-down instruction 
is one that turns off all circuitry of a functional unit other 
than memory that is private to that functional unit. 

Intermediate power-down levels are especially useful to 
avoid delays in restoring power. In other words, after a full 
power-down instruction, a power-up instruction or a non- 
power down instruction might result in a delay that would 
not occur if an intermediate power-down instruction had 
been used. 
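The trade-off between full and intermediate power-down levels, combined with implicit restoration by the next active instruction, can be modeled with a small state machine. This is a minimal sketch under stated assumptions: the class name, the READY/REST/SLEEP state names, and the wake-up latencies of 1 and 4 cycles are all hypothetical values chosen for illustration and are not taken from the description.

```python
from enum import Enum


class Readiness(Enum):
    READY = "ready"
    REST = "rest"    # intermediate power-down level
    SLEEP = "sleep"  # full power-down level


# Hypothetical wake-up delays in cycles; real values depend on the circuitry.
WAKE_LATENCY = {Readiness.READY: 0, Readiness.REST: 1, Readiness.SLEEP: 4}


class FunctionalUnit:
    """Toy model of one independently instructable functional unit."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state = Readiness.READY

    def execute(self, mnemonic: str) -> int:
        """Apply one instruction and return the stall cycles it incurs."""
        if mnemonic == "SLEEP":
            self.state = Readiness.SLEEP
            return 0
        if mnemonic == "REST":
            self.state = Readiness.REST
            return 0
        # Any other ("active") instruction implicitly restores readiness.
        stall = WAKE_LATENCY[self.state]
        self.state = Readiness.READY
        return stall


unit_l = FunctionalUnit("L")
unit_l.execute("SLEEP")
stall_after_sleep = unit_l.execute("ADD")
unit_l.execute("REST")
stall_after_rest = unit_l.execute("ADD")
```

With these assumed latencies, waking from the intermediate level costs fewer stall cycles than waking from the full power-down level, which is the delay the paragraph above refers to.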

Power Optimizing Compiler 

FIG. 7 illustrates the basic steps of an optimization 
process in accordance with the invention. As illustrated, for 
each functional unit, the program code is scanned to identify 
segments where the functional unit is not used. The 
identification of an "inactive segment" is made in terms of 
efficiency. Various power modeling techniques can be used to 
determine the length of time during which it is more efficient 
to turn a component off (or partially off) then on again versus 
leaving it on. The resulting "power down threshold" might 
be different for different functional units and for different 
power-down levels. 
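One simple way to arrive at such a "power down threshold" is a break-even calculation. The sketch below assumes a deliberately simple power model that is not given in the description: the unit consumes a fixed energy per cycle when ready and a smaller fixed energy per cycle when powered down, and each power-down followed by a power-up costs a fixed transition energy. The function and parameter names are hypothetical.

```python
import math


def power_down_threshold(ready_energy_per_cycle: float,
                         down_energy_per_cycle: float,
                         transition_energy: float) -> int:
    """Smallest whole number of idle cycles for which powering down saves energy.

    Powering down for t cycles saves (ready - down) * t and costs the
    transition energy, so it pays off only when
    t > transition_energy / (ready - down).
    """
    saving_per_cycle = ready_energy_per_cycle - down_energy_per_cycle
    if saving_per_cycle <= 0:
        raise ValueError("the power-down state must use less energy per cycle")
    return math.floor(transition_energy / saving_per_cycle) + 1


# Illustrative numbers only, in arbitrary energy units.
full_threshold = power_down_threshold(1.0, 0.0, 2.5)
intermediate_threshold = power_down_threshold(1.0, 0.25, 0.75)
```

Because the transition energy and the per-cycle saving differ between power-down levels and between functional units, each combination would get its own threshold under this model.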

After an inactive segment is identified, depending on 
factors such as the length of the segment, an appropriate 
power-down instruction is selected. For example, a long 
segment might call for a full power-down instruction 
whereas a shorter segment might call for an intermediate 
power down instruction. The power-down instruction is 
inserted at the beginning of the segment. Depending on the 
processor architecture, a power-up instruction may or may 
not be used. The process is repeated for each functional unit. 

One approach to implementing the method of FIG. 7 is for 
a programmer to manually scan the code and insert power- 
down instructions during the programming process. 
Alternatively, the method could be performed automatically 
by a compiler or assembler. A compiler would have the 
overall function of operating on a higher-level language to 
create power-efficient machine code for processor 10. An 
assembler would operate on assembly code. 

In the case of either a compiler or assembler, an optimizing 
process locates, for each functional unit, program segments 
during which the functional unit is not used. These segments 
would be of longer duration than some predetermined 
threshold. Once these segments are found, the compiler then 
inserts a power-modifying instruction at the point in the 
code when the functional unit first goes out of use. 
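The scan-and-insert step just described can be sketched as a small pass over a list of execute packets. This is an illustration under stated assumptions rather than the optimizer of FIG. 7: each execute packet is modeled as a list of hypothetical (mnemonic, functional unit) pairs, each packet is assumed to take one cycle, loops and branches are ignored, and the function names are invented for the example.

```python
from typing import List, Tuple

Instr = Tuple[str, str]   # (mnemonic, functional unit), e.g. ("ADD", "L")
Packet = List[Instr]      # one execute packet, assumed to take one cycle


def idle_segments(program: List[Packet], unit: str) -> List[Tuple[int, int]]:
    """Return (start index, length) of each maximal run of packets not using unit."""
    segments: List[Tuple[int, int]] = []
    start = None
    for i, packet in enumerate(program):
        used = any(u == unit for _, u in packet)
        if not used and start is None:
            start = i
        elif used and start is not None:
            segments.append((start, i - start))
            start = None
    if start is not None:
        segments.append((start, len(program) - start))
    return segments


def insert_sleep(program: List[Packet], unit: str, threshold: int) -> List[Packet]:
    """Return a copy of program with SLEEP placed where unit first goes out of use."""
    result = [list(packet) for packet in program]
    for start, length in idle_segments(program, unit):
        if length > threshold:
            # The unit is idle in this packet, so SLEEP can be directed to it.
            result[start].append(("SLEEP", unit))
    return result


program = [
    [("ADD", "L"), ("MPY", "M")],
    [("MPY", "M")],
    [("MPY", "M")],
    [("MPY", "M")],
    [("ADD", "L")],
    [("MPY", "M")],
]
optimized = insert_sleep(program, "L", threshold=2)
```

In this example the L unit is idle for three consecutive packets, which exceeds the assumed threshold of two, so a SLEEP directed to L is added to the first packet of that run. The single idle packet at the end is left alone. No power-up instruction is inserted, which corresponds to the implicit-restoration case.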

Locating program segments during which a functional 
unit is not used may be done by either static or dynamic 
program analysis. For static analysis, the compiler estimates 
the number of execute cycles between start and stop points, 
which may include an estimation of loop cycles and other 
statistical predictions. Dynamic analysis is performed after 
the code is in an executable form, such that the compiler may 
run the code and actually measure time. 

In either case, the compiler locates program segments of 
functional unit non-use. Then, the compiler inserts a 
power-modification instruction in the code. If there are 
different levels of power modification, the compiler compares 
the period of non-use with the thresholds of the various 
power-modifying instructions, and inserts the appropriate 
instruction. 
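Where two power-down levels are available, the comparison against the thresholds can be sketched as follows. The sketch assumes the SLEEP (full) and REST (intermediate) instruction types named earlier, with the full power-down threshold longer than the intermediate one. The function name and the threshold values are hypothetical.

```python
from typing import Optional


def select_power_instruction(idle_cycles: int,
                             rest_threshold: int,
                             sleep_threshold: int) -> Optional[str]:
    """Pick the power-down instruction type for an idle period, or None."""
    if rest_threshold >= sleep_threshold:
        raise ValueError("the REST threshold is assumed to be the shorter one")
    if idle_cycles > sleep_threshold:
        return "SLEEP"   # long segment: full power-down
    if idle_cycles > rest_threshold:
        return "REST"    # shorter segment: intermediate power-down
    return None          # too short to be worth powering down


# Illustrative thresholds of 4 and 20 cycles.
choices = [select_power_instruction(n, 4, 20) for n in (3, 10, 50)]
```

A period of non-use at or below the shorter threshold gets no power-modifying instruction at all, since the unit would have to be restored before any saving is realized.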

Other Embodiments 

Although the present invention has been described in 
detail, it should be understood that various changes, 
substitutions, and alterations can be made hereto without 
departing from the spirit and scope of the invention as 
defined by the appended claims. 
What is claimed is: 

1. A method of optimizing a computer program for 
reduced power consumption, where the program is written 
for a processor having distinct functional units, comprising 
the steps of: 

during program production prior to completion of said 
computer program identifying at least one segment of 
said computer program in which at least one functional 
unit is not used; and 

based on the results of said identifying step, inserting a 
power-down instruction at the beginning of said 
segment; 

wherein said power-down instruction is operable to cause 
said at least one functional unit to use less power during 
execution of said segment. 

2. The method of claim 1, wherein said identifying step is 
accomplished by statically estimating processor cycles prior 
to running said computer program. 

3. The method of claim 1, further comprising the step of 
during program production prior to completion of said 
computer program inserting a power-up instruction in said 
computer program, wherein said power-up instruction 
directed to said at least one functional unit is operable to 
restore said at least one functional unit to a ready state. 

4. The method of claim 1, wherein said power-down 
instruction includes a first power-down instruction operable 
to reduce power to all of said at least one functional unit, 
such that said functional unit is placed in a low state of 
readiness and a second power-down instruction operable to 
reduce power to only a part of said at least one functional 
unit, such that said functional unit is placed in an 
intermediate state of readiness. 

5. The method of claim 1, wherein said identifying step is 
accomplished by comparing the duration of said segment 
with a predetermined threshold. 

6. The method of claim 1, wherein said power-down 
instruction has a format including a first portion identifying 
said instruction as a power-down instruction and a second 
portion indicating the identity of a single functional unit. 

7. The method of claim 1, wherein said power-down 
instruction has a format including a first portion identifying 
said instruction as a power-down instruction and a second 
portion indicating the identity of a number of functional 
units. 

8. The method of claim 1, wherein: 

said power-down instruction includes 

a first power-down instruction operable to reduce 
power of said at least one functional unit to a first 
low state and place that said functional unit in a low 
state of readiness, and 

a second power-down instruction operable to reduce 
power of said at least one functional unit to a second 
low state greater than said first low state but less than 
a ready state and place said functional unit in an 
intermediate state of readiness greater than said low 
state of readiness; 

said identifying step includes comparing the duration of 
said segment with a first predetermined threshold and 
second predetermined threshold shorter than said first 
predetermined threshold; and 

said inserting step 

inserts said first power-down instruction if said 
segment has a duration longer than said first 
predetermined threshold, and 

inserts said second power-down instruction if said 
segment has a duration longer than said second 
predetermined threshold and less than said first 
predetermined threshold. 
9. A method of optimizing a computer program for 
reduced power consumption, where the program is written 
for a processor having distinct functional units, comprising 
the steps of: 

during program production prior to completion of said 
computer program identifying at least one segment of 
said computer program in which at least one functional 
unit is not used; and 

based on the results of said identifying step, inserting a 
first power-control instruction at the beginning of said 
segment and a second power-control instruction before 
the end of said segment; 

wherein said power-control instruction is operable to 
cause said at least one functional unit to toggle between 
a full power ready state and a power-down state 
wherein said at least one functional unit consumes less 
power than in said ready state. 